SEQUENCE LISTING
(1) GENERAL INFORMATION 



(i) APPLICANT: Yanagisawa, Masashi 
Bergsma, Derk 
Wilson, Shelagh 
Brooks, David 
Gellai, Miklos 



(ii) TITLE OF THE INVENTION: NOVEL LIGANDS OF THE NEUROPEPTIDE RECEPTOR HFGAN72



(iii) NUMBER OF SEQUENCES: 21 






(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: SmithKline Beecham Corporation

(B) STREET: 709 Swedeland Road

(C) CITY: King of Prussia

(D) STATE: PA

(E) COUNTRY: United States of America

(F) ZIP: 19406



(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: DOS 

(D) SOFTWARE: FastSEQ for Windows Version 2.0 



(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: 08/938,548 

(B) FILING DATE: 26-SEPT-1997 

(C) CLASSIFICATION:



(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/887,382 

(B) FILING DATE: 2-JUL-1997

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/820,519 

(B) FILING DATE: 19-MAR-1997

(A) APPLICATION NUMBER: 60/033,604 

(B) FILING DATE: 17-DEC-1997 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Elizabeth J. Hecht 

(B) REGISTRATION NUMBER: 41,824 

(C) REFERENCE/DOCKET NUMBER: ATG50037-2

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 610-270-5009 

(B) TELEFAX: 610-270-5090 

(C) TELEX: 

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1970 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:



AAAACATAAT GTGGGTCTCG CGTCTGCCTC TCTCCCGCCC CTAATTAGCA GCTGCCTCCC 60
TCCATATTGT CCCAGGCCAG CGCTTCTTTT GTGCTCCCAG ATTCCTGGGT GCAAGGTGGC 120
CTCATTAGTG CCCGGAGACC GCCCCATCTC CAGGGAGCAG ATAGACAGAC AAGGGGGTGA 180
TCAGGGGCAC AGTGATCCAA CCCTGGCCTC TGAACGCCGC AGCGGCCATT CCTTGGGCCC 240
AGCCTGGAGA CGGCCCCCCT GCAGCAGGCT AATCTTAGAC TTGCCTTTGT CTGGCCTGGG 300
TGTGGACGCA ATGTGCCTGT CAATTCCCCG CCACCTCAGA GCACTATAAA CCCCAGACCC 360
CTGGGAGTGG GTCACAATTG ACAGCCTCAA GGTTCCTGGC TTTTTGAACC ACCACAGACA 420
TCTCCTTTCC CGGCTACCCC ACCCTGAGCG CCAGACACCA TGAACCTTCC TTCCACAAAG 480
GTAAAGATCC AGGGATGGAG GGGTGACTCA GCCATCCCAG AGGAAGCAAA AAGAGTGCTT 540
GCTCAGAGGG CTGGAAGAAA GGCCAAAGGT GTCTCCACTC TTGGTCTTTT CCTGGGTGTG 600
CTCTGAGGCA GGAGCACCTG CCTTGGCTCA CATTGGGTTG GGTGCTGTTT TGCTAAGAGC 660
CTGTGTTTGC TGAGCTCATA TGTGTCAGGT GCTCCGTTTG CACCTGTCAT CTCTTGTCAT 720
CCTCCCAACA GCCTTGCAGA GTAGAAATTA TTTCTAGTAT ACCCAGTTTA CAGGTAAGGG 780
AGCTGTGCCC TCTGAAAGGG CAGGAAACTG GTTCAAAGCA ACGGAGTTCA GTCACTCCTG 840
CAAGGGGGCA GGCAGATGAG AGAGCATTCT GGAGTCTTGC TAGTTCCTGA TTTCCATGTG 900
TTTCCCTGCT GTGGAGAGGA AGTTGGGGGG ACTCAGTAGG GCCCGGGTTT TTCCCAAGTT 960
TACAACTTCT GCTGCAGACA GACACTCCTG TTTTCAGGTG GAGTGGCAAG TGCCCTAGTG 1020
GTGGCAACAG TGGCCTAAGT CTCCAGAGAA AAGGGGGATT CACTCTGCCC AGGGGGTCTC 1080
AAAAGGCTTC CTGTGGGAGA TGCTCTGCTG GGTCTTGAAG GAGGAGCAGG GAAAGTAGGC 1140
CGATACCAGC AAGGGCGCAA AGCAAGGAGA ACTAAGTGAC AGCCAGAAAG GAGTGCAGGC 1200
TTGGAGGGGG CGCGGAGCCA GAGGGGCAGG TCCTGTGCGT GGGAGCTGGT GGCGGGCGCC 1260
GTGGGAAGAC CCCCCCAGCG CCCTGTCTCC GTCTCCCTAG GTCTCCTGGG CCGCCGTGAC 1320
GCTACTGCTG CTGCTGCTGC TGCTGCCGCC CGCGCTGTTG TCGTCCGGGG CGGCTGCACA 1380
GCCCCTGCCC GACTGCTGTC GTCAAAAGAC TTGCTCTTGC CGCCTCTACG AGCTGCTGCA 1440
CGGCGCGGGC AATCACGCGG CCGGCATCCT CACGCTGGGC AAGCGGAGGT CCGGGCCCCC 1500
GGGCCTCCAG GGTCGGCTGC AGCGCCTCCT GCAGGCCAGC GGCAACCACG CCGCGGGCAT 1560
CCTGACCATG GGCCGCCGCG CAGGCGCAGA GCCAGCGCCG CGCCCCTGCC TCGGGCGCCG 1620
CTGTTCCGCC CCGGCCGCCG CCTCCGTCGC GCCCGGAGGA CAGTCCGGGA TCTGAGTCGT 1680
TCTTCGGGCC CTGTCCTGGC CCAGGCCTCT GCCCTCTGCC CACCCAGCGT CAGCCCCCAG 1740
AAAAAAGGCA ATAAAGACGA GTCTCCATTC GTGTGACTGG TCTCTGTTCC TGTGCGGTCG 1800
CGTCCTGCCC ATCCGGGGTG GCAAAGCGTC TTGCGGAGGA CAGCTGGGCC TGGAAGCCCG 1860
GCTGTCGGGC ACCAGCCTTA GCTTTTGCGT GGTTGAATCG GAAACACTCT TGGTTGGGGA 1920
GTTCCCAGTG CAAGGCCCTG GGGCACAGAG AGAACTGCAC AGGTGCATGC 1970



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Asn Leu Pro Ser Thr Lys Val Ser Trp Ala Ala Val Thr Leu Leu
1               5                   10                  15

Leu Leu Leu Leu Leu Leu Pro Pro Ala Leu Leu Ser Ser Gly Ala Ala
            20                  25                  30

Ala Gln Pro Leu Pro Asp Cys Cys Arg Gln Lys Thr Cys Ser Cys Arg
        35                  40                  45

Leu Tyr Glu Leu Leu His Gly Ala Gly Asn His Ala Ala Gly Ile Leu
    50                  55                  60

Thr Leu Gly Lys Arg Arg Ser Gly Pro Pro Gly Leu Gln Gly Arg Leu
65                  70                  75                  80

Gln Arg Leu Leu Gln Ala Ser Gly Asn His Ala Ala Gly Ile Leu Thr
                85                  90                  95

Met Gly Arg Arg Ala Gly Ala Glu Pro Ala Pro Arg Pro Cys Leu Gly
            100                 105                 110

Arg Arg Cys Ser Ala Pro Ala Ala Ala Ser Val Ala Pro Gly Gly Gln
        115                 120                 125

Ser Gly Ile
    130



(2) INFORMATION FOR SEQ ID NO: 3:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Gln Pro Leu Pro Asp Cys Cys Arg Gln Lys Thr Cys Ser Cys Arg Leu
1               5                   10                  15

Tyr Glu Leu Leu His Gly Ala Gly Asn His Ala Ala Gly Ile Leu Thr
            20                  25                  30

Leu



(2) INFORMATION FOR SEQ ID NO: 4:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Arg Ser Gly Pro Pro Gly Leu Gln Gly Arg Leu Gln Arg Leu Leu Gln
1               5                   10                  15

Ala Ser Gly Asn His Ala Ala Gly Ile Leu Thr Met
            20                  25

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 585 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:



GGCTCGGCGG CCTCAGACTC CTTGGGTATT TGGACCACTG CACCGAAGAT ACCATCTCTC 60
CGGATTGCCT CTCCCTGAGC TCCAGACACC ATGAACCTTC CTTCTACAAA GGTTCCCTGG 120
GCCGCCGTGA CGCTGCTGCT GCTGCTACTG CTGCCGCCGG CGCTGCTGTC GCTTGGGGTG 180
GACGCGCAGC CTCTGCCCGA CTGCTGTCGC CAGAAGACGT GTTCCTGCCG TCTCTACGAA 240
CTGTTGCACG GAGCTGGCAA CCACGCCGCG GGCATCCTCA CTCTGGGAAA GCGGCGACCT 300
GGACCCCCAG GCCTCCAAGG ACGGCTGCAG CGCCTCCTTC AGGCCAACGG TAACCACGCA 360
GCTGGCATCC TGACCATGGG CCGCCGCGCA GGCGCAGAGC TAGAGCCATA TCCCTGCCCT 420
GGTCGCCGCT GTCCGACTGC AACCGCCACC GCTTTAGCGC CCCGGGGCGG ATCCAGAGTC 480
TGAACCCGTC TTCTATCCCT GTCCTAGTCC TAACTTTCCC CTCTCCTCGC CGGTCCCTAG 540
GCAATAAAGA CGTTTCTCTG CTAAAAAAAA AAAAAAAAAA AAAAA 585



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 130 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Asn Leu Pro Ser Thr Lys Val Pro Trp Ala Ala Val Thr Leu Leu
1               5                   10                  15

Leu Leu Leu Leu Leu Pro Pro Ala Leu Leu Ser Leu Gly Val Asp Ala
            20                  25                  30

Gln Pro Leu Pro Asp Cys Cys Arg Gln Lys Thr Cys Ser Cys Arg Leu
        35                  40                  45

Tyr Glu Leu Leu His Gly Ala Gly Asn His Ala Ala Gly Ile Leu Thr
    50                  55                  60

Leu Gly Lys Arg Arg Pro Gly Pro Pro Gly Leu Gln Gly Arg Leu Gln
65                  70                  75                  80

Arg Leu Leu Gln Ala Asn Gly Asn His Ala Ala Gly Ile Leu Thr Met
                85                  90                  95

Gly Arg Arg Ala Gly Ala Glu Leu Glu Pro Tyr Pro Cys Pro Gly Arg
            100                 105                 110

Arg Cys Pro Thr Ala Thr Ala Thr Ala Leu Ala Pro Arg Gly Gly Ser
        115                 120                 125

Arg Val
    130

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Asn Leu Pro Ser Thr Lys Val Pro Trp Ala Ala Val Thr Leu Leu
1               5                   10                  15

Leu Leu Leu Leu Leu Pro Pro Ala Leu Leu Ser Leu Gly Val Asp Ala
            20                  25                  30

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Gln Pro Leu Pro Asp Cys Cys Arg Gln Lys Thr Cys Ser Cys Arg Leu
1               5                   10                  15

Tyr Glu Leu Leu His Gly Ala Gly Asn His Ala Ala Gly Ile Leu Thr
            20                  25                  30

Leu

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Arg Pro Gly Pro Pro Gly Leu Gln Gly Arg Leu Gln Arg Leu Leu Gln
1               5                   10                  15

Ala Asn Gly Asn His Ala Ala Gly Ile Leu Thr Met
            20                  25



(2) INFORMATION FOR SEQ ID NO: 10: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 123 amino acids 

(B) TYPE: amino acid 






(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Val Pro Trp Ala Ala Val Thr Leu Leu Leu Leu Leu Leu Leu Pro Pro
1               5                   10                  15

Ala Leu Leu Ser Leu Gly Val Asp Ala Gln Pro Leu Pro Asp Cys Cys
            20                  25                  30

Arg Gln Lys Thr Cys Ser Cys Arg Leu Tyr Glu Leu Leu His Gly Ala
        35                  40                  45

Gly Asn His Ala Ala Gly Ile Leu Thr Leu Gly Lys Arg Arg Pro Gly
    50                  55                  60

Pro Pro Gly Leu Gln Gly Arg Leu Gln Arg Leu Leu Gln Ala Asn Gly
65                  70                  75                  80

Asn His Ala Ala Gly Ile Leu Thr Met Gly Arg Arg Ala Gly Ala Glu
                85                  90                  95

Leu Glu Pro His Pro Cys Ser Gly Arg Gly Cys Pro Thr Val Thr Thr
            100                 105                 110



Thr Ala Leu Ala Pro Arg Gly Gly Ser Gly Val
        115                 120

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Gln Pro Leu Pro Asp Cys Cys Arg Gln Lys Thr Cys Ser Cys Arg Leu
1               5                   10                  15

Tyr Glu Leu Leu His Gly Ala Gly Asn His Ala Ala Gly Ile Leu Thr
            20                  25                  30

Leu




(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Arg Pro Gly Pro Pro Gly Leu Gln Gly Arg Leu Gln Arg Leu Leu Gln
1               5                   10                  15

Ala Asn Gly Asn His Ala Ala Gly Ile Leu Thr Met
            20                  25

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
CAACCNCTNC CNGACTGCTG 20

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 




(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

ATNCCNGCNG CATGATT 17



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

CGGCAGGAAC ACGTCTTCTG GCG 23 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

TCCTTGGGTA TTTGGACCAC TGCACCGAAG 30

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

ATACCATCTC TCCGGATTGC CTCTCCCTGA 30

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

CCTCTGAAGG TTCCAGAATC GATAGTAN 28



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
CCTCTGAAGG TTCCAGAATC GATAG 25

(2) INFORMATION FOR SEQ ID NO: 21: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 577 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

CACAATTGAC AGCCTCAAGG TTCCTGGCTT TTTGAACCAC CACAGACATC TCCTTTCCCG 60
GCTACCCCAC CCTGAGCGCC AGACACCATG AACCTTCCTT CCACAAAGGT CTCCTGGGCC 120
GCCGTGACGC TACTGCTGCT GCTGCTGCTG CTGCCGCCCG CGCTGTTGTC GTCCGGGGCG 180
GCTGCACAGC CCCTGCCCGA CTGCTGTCGT CAAAAGACTT GCTCTTGCCG CCTCTACGAG 240
CTGCTGCACG GCGCGGGCAA TCACGCGGCC GGCATCCTCA CGCTGGGCAA GCGGAGGTCC 300
GGGCCCCCGG GCCTCCAGGG TCGGCTGCAG CGCCTCCTGC AGGCCAGCGG CAACCACGCC 360
GCGGGCATCC TGACCATGGG CCGCCGCGCA GGCGCAGAGC CAGCGCCGCG CCCCTGCCTC 420
GGGCGCCGCT GTTCCGCCCC GGCCGCCGCC TCCGTCGCGC CCGGAGGACA GTCCGGGATC 480
TGAGTCGTTC TTCGGGCCCT GTCCTGGCCC AGGCCTCTGC CCTCTGCCCA CCCAGCGTCA 540
GCCCCCAGAA AAAAGGCAAT AAAGACGAGT CTCCATT 577



